
Update bos_token #4806


Closed
wants to merge 1 commit into from

Conversation

aacedar
Contributor

@aacedar aacedar commented Jul 2, 2025

PR type


PR information

  1. For the Qwen model, `encode` does not output special tokens such as `<|im_start|>` regardless of whether `add_special_tokens=True/False`; those special tokens are added when the prompt pieces are concatenated, so no extra handling is needed. We therefore need to check whether `bos_token` is empty.
  2. If `bos_token` is not empty, special tokens do need to be added. However, `bos_token = all_tokens[:idx]` yields a list, so `prompts_text.append(''.join(res_context_list))` fails because `res_context_list[0]` is of type list. Moreover, that `bos_token` holds token ids rather than token strings, which would also break a later encode, so using `self.tokenizer.bos_token` is more appropriate.
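The failure mode described above can be sketched in isolation (the list contents below are hypothetical; in ms-swift, `res_context_list` comes from `template._swift_encode`):

```python
# Hypothetical res_context_list: the first element is a list of BOS token ids
# (what `bos_token = all_tokens[:idx]` produces); the rest are prompt strings.
res_context_list = [[100000], 'User: hello\n', 'Assistant: ']

# What _apply_chat_template_to_messages_list effectively does; this raises
# "TypeError: sequence item 0: expected str instance, list found".
try:
    prompt = ''.join(res_context_list)
except TypeError as exc:
    print(exc)

# Using the BOS token *string* (self.tokenizer.bos_token in this PR) keeps
# every element a str, so the join succeeds.
bos_token = '<|begin_of_sentence|>'  # placeholder for self.tokenizer.bos_token
prompt = ''.join([bos_token] + res_context_list[1:])
print(prompt)
```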

@hjh0119
Collaborator

hjh0119 commented Jul 2, 2025

Nice catch!

Should we modify the function _apply_chat_template_to_messages_list to make it compatible with eos_token?

For example:

    def _apply_chat_template_to_messages_list(self, messages_list: InputsType):
        prompts_text = []
        for messages in messages_list:
            InferRequest.remove_response(messages)
            template_inputs, _ = StdTemplateInputs.from_dict({'messages': messages})
            res_context_list, _, _ = self.template._swift_encode(template_inputs)
-           prompts_text.append(''.join(res_context_list))
+           prompts_text.append(''.join(elem for elem in res_context_list if isinstance(elem, str)))

        return prompts_text

@aacedar
Contributor Author

aacedar commented Jul 2, 2025

@hjh0119 I have read the full `self.template._swift_encode(template_inputs)` function; its return value is a list. If an element of `res_context_list` is itself a list, the result is not correct, so we should raise an exception here instead of ignoring the error. That is, rather than checking `isinstance(elem, str)`, I think we should just throw an exception to alert the user.

@aacedar
Contributor Author

aacedar commented Jul 3, 2025

If an element's type is list, that means it holds token-id values, not text. For example, `tokenizer.encode("<|begin▁of▁sentence|><|end▁of▁sentence|>")` returns a list of token ids, even though the input is of type str before `tokenizer.encode()` is called. Every element of `res_context_list` should be of type str, and the list should then be encoded uniformly in subsequent processing.
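The distinction can be sketched with a toy tokenizer (the class and vocabulary below are made up for illustration; the real code uses `self.tokenizer`):

```python
# Toy tokenizer illustrating why token ids and token strings must not be mixed.
class ToyTokenizer:
    bos_token = '<BOS>'                 # a str: safe to ''.join with other strs
    _vocab = {'<BOS>': 0, 'hello': 1}

    def encode(self, text):
        # encode() turns text into a list of integer ids
        return [self._vocab[tok] for tok in text.split()]

tok = ToyTokenizer()
ids = tok.encode('<BOS> hello')
print(ids)            # a list of ints: joining these with strings fails
print(tok.bos_token)  # a plain str
```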

@hjh0119
Collaborator

hjh0119 commented Jul 3, 2025

I believe the related issue arises because template_meta.prefix contains the BOS token as an integer, which leads to a join failure.

Therefore, I think ignoring integer values in the _apply_chat_template_to_messages_list method is a safer approach, since this method is only used for logging purposes and omitting the BOS token will not have any impact.

Are there any training scripts that fail to run on the main branch but work with this PR?

@aacedar
Contributor Author

aacedar commented Jul 3, 2025

Yes, the related issue arises because of `template_meta.prefix`; I completely agree with you on that point. You can see my other PR, #4813, which fixes the `template_meta.prefix` bug.

@hjh0119
Collaborator

hjh0119 commented Jul 3, 2025

So, are there any training/inference scripts that fail to run on the main branch but work with this PR?

@aacedar
Contributor Author

aacedar commented Jul 3, 2025

Just replace `your_own_path` in `--model` and `--output_dir`, then you can test:

    CUDA_VISIBLE_DEVICES=7 \
    swift rollout \
      --model your_own_path/deepseek-ai/deepseek-coder-6.7b-base \
      > logs/log-grpo-rollout.log 2>&1 &

    CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 \
    NPROC_PER_NODE=6 \
    swift rlhf \
      --rlhf_type grpo \
      --model your_own_path/deepseek-ai/deepseek-coder-6.7b-base \
      --reward_funcs accuracy \
      --use_vllm true \
      --vllm_mode server \
      --vllm_server_host 127.0.0.1 \
      --vllm_server_port 8000 \
      --train_type full \
      --torch_dtype bfloat16 \
      --dataset AI-MO/NuminaMath-TIR#1000 \
      --split_dataset_ratio 0 \
      --max_completion_length 512 \
      --num_train_epochs 1 \
      --per_device_train_batch_size 2 \
      --learning_rate 1e-6 \
      --gradient_accumulation_steps 2 \
      --save_total_limit 2 \
      --logging_steps 1 \
      --deepspeed zero2 \
      --max_length 4096 \
      --warmup_ratio 0.05 \
      --dataloader_num_workers 2 \
      --dataset_num_proc 4 \
      --num_generations 6 \
      --temperature 0.9 \
      --top_p 0.9 \
      --top_k 50 \
      --log_completions true \
      --num_iterations 1 \
      --beta 0.01 \
      --output_dir ./ds-base-grpo \
      --report_to tensorboard \
      > logs/log-grpo-test.log 2>&1 &

@hjh0119
Copy link
Collaborator

hjh0119 commented Jul 3, 2025

I've tested the script you provided using your commit, but it still raises the following error: TypeError: sequence item 0: expected str instance, list found.

    @@ -1039,8 +1039,16 @@ def _swift_encode(self, inputs: StdTemplateInputs):
        idx = all_tokens.index(single_token[0])
        bos_token = all_tokens[:idx]
        sep_token = all_tokens[idx + 1:]
Collaborator

`auto_add_bos` is False for Qwen models, so the encode logic for Qwen models will not reach this branch.

Contributor Author

@aacedar aacedar Jul 3, 2025


This function will not work for Qwen models, because the `QwenTemplateMeta` class in qwen.py sets `auto_add_bos = False`; for the deepseek model, however, an exception will be raised.

swift/llm/template/template/qwen.py

@dataclass
class QwenTemplateMeta(ChatmlTemplateMeta):
    default_system: Optional[str] = DEFAULT_SYSTEM
    auto_add_bos: bool = False
    stop_words: List[Word] = field(default_factory=lambda: ['<|endoftext|>'])
    agent_template: str = 'hermes'

@aacedar
Contributor Author

aacedar commented Jul 3, 2025

I've tested the script you provided using your commit, but it still raises the following error: TypeError: sequence item 0: expected str instance, list found.

You should fix the #4806 bug first; I have committed a PR for that bug.

@aacedar aacedar closed this Jul 7, 2025